A Proposal for a Heterogeneous Cluster ScaLAPACK (Dense Linear Solvers)
Abstract
In this paper, we study the implementation of dense linear algebra kernels, such as matrix multiplication or linear system solvers, on heterogeneous networks of workstations. The uniform block-cyclic data distribution scheme commonly used for homogeneous collections of processors limits the performance of these linear algebra kernels on heterogeneous grids to the speed of the slowest processor. We present and study more sophisticated data allocation strategies that balance the load on heterogeneous platforms with respect to the performance of the processors. When targeting unidimensional grids, the load-balancing problem can be solved rather easily. When targeting two-dimensional grids, which are the key to scalability and efficiency for numerical kernels, the problem turns out to be surprisingly difficult. We formally state the 2D load-balancing problem and prove its NP-completeness. Next, we introduce a data allocation heuristic, which turns out to be very satisfactory: its practical usefulness is demonstrated by MPI experiments conducted with a heterogeneous network of workstations.
Index Terms: Heterogeneous network, heterogeneous grid, different-speed processors, load-balancing, data distribution, data allocation, numerical libraries, numerical linear algebra, heterogeneous platforms, cluster computing.
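The contrast the abstract draws between a uniform distribution and a speed-aware one can be illustrated for the unidimensional case. The following is a minimal sketch, not the paper's own algorithm: it assumes each processor is described by a cycle-time (time to process one block), and it uses a simple greedy rule that gives each successive block to the processor that would finish it soonest. The names `allocate_blocks` and `makespan`, and the cycle-times used, are hypothetical.

```python
def allocate_blocks(cycle_times, num_blocks):
    """Greedy 1D allocation (illustrative assumption, not the paper's method).

    cycle_times[i] is the time processor i needs for one block.
    Returns the number of blocks assigned to each processor.
    """
    counts = [0] * len(cycle_times)
    for _ in range(num_blocks):
        # Choose the processor whose finish time after one more block is smallest.
        best = min(
            range(len(cycle_times)),
            key=lambda i: (counts[i] + 1) * cycle_times[i],
        )
        counts[best] += 1
    return counts


def makespan(cycle_times, counts):
    """Time at which the most heavily loaded processor finishes."""
    return max(c * t for c, t in zip(counts, cycle_times))


# Hypothetical platform: processor 0 is four times faster than processor 2.
cycle_times = [1, 2, 4]
balanced = allocate_blocks(cycle_times, 21)
uniform = [7, 7, 7]

print(balanced, makespan(cycle_times, balanced))  # [12, 6, 3] 12
print(uniform, makespan(cycle_times, uniform))    # [7, 7, 7] 28
```

With the uniform split, the run time is dictated by the slowest processor (28 time units), whereas the speed-aware split finishes in 12. This sketch covers only the unidimensional case; it says nothing about the two-dimensional problem that the abstract describes as NP-complete.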
Related papers
HeteroMPI+ScaLAPACK: Towards a ScaLAPACK (Dense Linear Solvers) on Heterogeneous Networks of Computers
The paper presents a tool that ports ScaLAPACK programs designed to run on massively parallel processors to Heterogeneous Networks of Computers. The tool converts ScaLAPACK programs to HeteroMPI programs. The resulting HeteroMPI programs do not aim to extract the maximum performance from Heterogeneous Networks of Computers but provide an easy and simple way to execute the ScaLAPACK programs o...
Efficient Parallel Solvers for Large Dense Systems of Linear Interval Equations
Verified solvers for dense linear (interval-)systems require a lot of resources, both in terms of computing power and memory usage. Computing a verified solution of large dense linear systems (dimension n > 10000) on a single machine quickly approaches the limits of today’s hardware. Therefore, an efficient parallel verified solver for distributed memory systems is needed. In this work we prese...
Static LU Decomposition on Heterogeneous Platforms
In this paper, the authors deal with algorithmic issues on heterogeneous platforms. They concentrate on dense linear algebra kernels, such as matrix multiplication or LU decomposition. Block-cyclic distribution techniques used in ScaLAPACK are no longer sufficient to balance the load among processors running at different speeds. The main result of this paper is to provide a static data distribu...
Fast (Parallel) Dense Linear System Solvers in C-XSC Using Error Free Transformations and BLAS
Existing selfverifying solvers for dense linear (interval-)systems in C-XSC provide high accuracy, but are rather slow. A new set of solvers is presented, which are a lot faster than the existing solvers, without losing too much accuracy. This is achieved through two main changes. First, an alternative method for the computation of exact dot products based on the DotK-Algorithm is implemented. ...
A Note on ScaLAPACK's Banded System Solvers
We suggest modifications in the local computations of the ScaLAPACK subroutines for solving diagonally dominant and arbitrary narrow-banded linear systems. The modifications concern the way auxiliary variables are stored. The numerical properties of the algorithms are not affected. However, as the way the memory is accessed is changed, the performance of the solvers is significantly improved. We dis...
Journal:
- IEEE Trans. Computers
Volume: 50
Publication date: 1999